Skip to content

fix: remove empty files left when split-volume encrypted extraction f…#360

Merged
deepin-bot[bot] merged 1 commit intolinuxdeepin:release/eaglefrom
LiHua000:release/eagle
Feb 2, 2026
Merged

fix: remove empty files left when split-volume encrypted extraction f…#360
deepin-bot[bot] merged 1 commit intolinuxdeepin:release/eaglefrom
LiHua000:release/eagle

Conversation

@LiHua000
Copy link
Contributor

@LiHua000 LiHua000 commented Feb 2, 2026

…ails (e.g. wrong password)

Log: as title

Bug: https://pms.uniontech.com/bug-view-343249.html

Summary by Sourcery

Clean up artifacts created during failed archive extraction operations and ensure progress is reported when aborting processing.

Bug Fixes:

  • Remove zero-byte files and empty directories left in the target path when a full extraction fails, such as for split encrypted archives with incorrect passwords.
  • Emit final progress updates before terminating the extraction process when line handling fails to avoid leaving the UI in an indeterminate state.

@sourcery-ai
Copy link

sourcery-ai bot commented Feb 2, 2026

Reviewer's Guide

Adds a cleanup routine that, on failed full-volume extractions (e.g., wrong password on split encrypted archives), removes empty files and then prunes empty directories in the target path, and ensures progress is set to 100% before aborting when line handling fails.

Sequence diagram for extraction failure cleanup and progress update

sequenceDiagram
    actor User
    participant CliInterface
    participant ArchiveProcess
    participant FileSystem

    User->>CliInterface: startExtraction()
    CliInterface->>ArchiveProcess: start()

    loop processOutput
        ArchiveProcess-->>CliInterface: stdout line
        CliInterface->>CliInterface: readStdout(handleAll)
        CliInterface->>CliInterface: handleLine(line, m_workStatus)
        alt handleLine fails
            CliInterface->>CliInterface: signalprogress(100)
            CliInterface->>ArchiveProcess: killProcess()
            note over CliInterface: break
        end
    end

    ArchiveProcess-->>CliInterface: finished(exitCode != 0)
    CliInterface->>CliInterface: extractProcessFinished(exitCode, exitStatus)
    alt full extraction and target path set
        CliInterface->>CliInterface: removeExtractedFilesOnFailure(strTargetPath, m_files)
        CliInterface->>FileSystem: remove empty files in target path
        CliInterface->>FileSystem: prune empty directories in target path
    end
    CliInterface-->>User: report extraction failure
Loading

Updated class diagram for CliInterface extraction failure handling

classDiagram
    class CliInterface {
        +bool moveExtractTempFilesToDest(QList_FileEntry files, ExtractionOptions options)
        +void removeExtractedFilesOnFailure(QString strTargetPath, QList_FileEntry entries)
        +bool handleLongNameExtract(QList_FileEntry files)
        +void readStdout(bool handleAll)
        +void extractProcessFinished(int exitCode, QProcess_ExitStatus exitStatus)

        -ExtractionOptions m_extractOptions
        -QList_FileEntry m_files
        -WorkStatus m_workStatus
        -bool m_listEmptyLines
    }

    class ExtractionOptions {
        +bool bAllExtract
        +QString strTargetPath
        +QString strDestination
    }

    class FileEntry {
        +QString strFullPath
        +bool isDirectory
    }

    class DataManager {
        +static DataManager get_instance()
        +ArchiveData archiveData()
    }

    class ArchiveData {
        +QMap_Int_FileEntry mapFileEntry
    }

    CliInterface --> ExtractionOptions
    CliInterface --> FileEntry
    DataManager --> ArchiveData
    ArchiveData --> FileEntry
Loading

Flow diagram for removeExtractedFilesOnFailure cleanup routine

flowchart TD
    A[Start removeExtractedFilesOnFailure] --> B{entries is empty?}
    B -- Yes --> C[entries = ArchiveData.mapFileEntry.values]
    B -- No --> D[listToRemove = entries]
    C --> E{listToRemove is empty?}
    D --> E{listToRemove is empty?}
    E -- Yes --> Z[Return]
    E -- No --> F[Create QDir targetDir from strTargetPath]
    F --> G{targetDir exists?}
    G -- No --> Z
    G -- Yes --> H[Build paths list from listToRemove
relPath = entry.strFullPath
trim trailing slash
skip empty relPath
store absolute path and isDirectory]
    H --> I[For each path with isDirectory == false
if file exists and size == 0
remove file]
    I --> J[removed = false]
    J --> K[For each path with isDirectory == true
if directory exists and isEmpty
removeRecursively
set removed = true]
    K --> L{removed is true?}
    L -- Yes --> J
    L -- No --> Z
    Z[End removeExtractedFilesOnFailure]
Loading

File-Level Changes

Change Details Files
Introduce a post-failure extraction cleanup that removes zero-size files and prunes now-empty directories in the extraction target.
  • Add CliInterface::removeExtractedFilesOnFailure helper that derives the affected file list (either from given entries or ArchiveData), builds absolute paths from archive entries, deletes zero-length files, and iteratively removes empty directories until none remain.
  • Invoke the cleanup helper in extractProcessFinished when extraction fails, is a full extract, and a non-empty target path is configured.
3rdparty/interface/archiveinterface/cliinterface.cpp
3rdparty/interface/archiveinterface/cliinterface.h
Adjust extraction output handling to always complete progress reporting on parse failure before killing the process.
  • When handleLine returns false in readStdout, emit progress 100% before killing the process so UI progress reflects completion even on early termination.
3rdparty/interface/archiveinterface/cliinterface.cpp

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

Copy link

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 1 issue, and left some high level feedback:

  • In removeExtractedFilesOnFailure, consider building a set of unique directory paths (and possibly sorting them deepest-first) instead of repeatedly scanning the full paths list in a do/while loop, to avoid redundant checks on large archives.
  • When removing directories and files, you may want to explicitly handle symlinks (e.g., using QFileInfo::isSymLink) to avoid unintentionally following or deleting outside the intended extraction tree.
  • The logic for determining which entries to clean up falls back to all archive entries when the provided list is empty; consider documenting or enforcing whether this behavior is always safe for partial extractions or mixed operations to avoid over-deletion in edge cases.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- In removeExtractedFilesOnFailure, consider building a set of unique directory paths (and possibly sorting them deepest-first) instead of repeatedly scanning the full paths list in a do/while loop, to avoid redundant checks on large archives.
- When removing directories and files, you may want to explicitly handle symlinks (e.g., using QFileInfo::isSymLink) to avoid unintentionally following or deleting outside the intended extraction tree.
- The logic for determining which entries to clean up falls back to all archive entries when the provided list is empty; consider documenting or enforcing whether this behavior is always safe for partial extractions or mixed operations to avoid over-deletion in edge cases.

## Individual Comments

### Comment 1
<location> `3rdparty/interface/archiveinterface/cliinterface.cpp:1130-1132` </location>
<code_context>
+
+    for (const auto &p : paths) {
+        const QString &path = p.first;
+        if (!p.second) {   // 文件
+            QFileInfo fi(path);
+            if (fi.exists() && fi.isFile() && fi.size() == 0) {
+                QFile::remove(path);
+            }
</code_context>

<issue_to_address>
**issue (bug_risk):** Deleting all zero-size files on failure may remove legitimately empty files from the archive.

This will delete all zero-sized files in `paths`, including ones that are intentionally empty and were successfully written before the failure. To avoid removing valid files, consider tracking which files were actually created/modified during the failed extraction (e.g., via a temp name/extension, a process-local list, or a timestamp window) instead of relying solely on `size() == 0`.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment on lines +1130 to +1132
if (!p.second) { // 文件
QFileInfo fi(path);
if (fi.exists() && fi.isFile() && fi.size() == 0) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (bug_risk): Deleting all zero-size files on failure may remove legitimately empty files from the archive.

This will delete all zero-sized files in paths, including ones that are intentionally empty and were successfully written before the failure. To avoid removing valid files, consider tracking which files were actually created/modified during the failed extraction (e.g., via a temp name/extension, a process-local list, or a timestamp window) instead of relying solely on size() == 0.

@deepin-ci-robot
Copy link

deepin pr auto review

这段代码新增了一个 removeExtractedFilesOnFailure 函数,用于在解压失败时清理残留文件,并在解压流程结束时调用它。以下是对代码的审查意见,包括语法逻辑、代码质量、代码性能和代码安全四个方面:

1. 语法逻辑

  • 清理逻辑的局限性
    removeExtractedFilesOnFailure 函数中,文件清理的条件是 fi.size() == 0。逻辑注释中提到是“清理 size 为 0 等残留文件”。然而,如果解压过程中断,部分文件可能已经写入了部分数据(非0大小),这些文件目前不会被清理。这可能导致用户目录中留下损坏或不完整的文件。

    • 建议:评估是否需要清理所有在解压失败期间创建的文件,而不仅仅是大小为0的文件。如果只清理0字节文件,应确保这符合所有解压失败场景(如密码错误、磁盘满、中断等)的需求。
  • 目录删除的循环逻辑
    代码使用 do...while 循环来处理多层嵌套的空目录。逻辑上是可以工作的,但 d.removeRecursively() 本身就是递归删除目录及其内容的。如果目录是空的,rmdir() 更轻量。如果目录不为空,removeRecursively 会删除非空目录,这可能不是初衷(只删除空目录)。

    • 建议:如果目标是删除空目录,建议使用 QDir::rmdir()。如果目录不为空,rmdir() 会失败并返回 false,这正是我们想要的效果(防止误删用户文件)。
    • 改进代码
      if (d.exists() && d.isEmpty()) {
          d.rmdir(); // 或者 d.removeRecursively(),取决于是否想强制删除空目录下的隐藏文件等
          removed = true;
      }
  • 路径处理
    relPath.chop(1) 用于去除末尾的斜杠,这是正确的。但 QDir::absoluteFilePath 会处理路径拼接,确保逻辑正确。

2. 代码质量

  • 变量命名
    listToRemovepaths 命名比较通用。

    • 建议listToRemove 可以改为 entriesToProcesspaths 可以改为 fileSystemPaths,更具描述性。
  • 代码复用与结构
    paths 列表构建和删除逻辑分成了两个循环。虽然逻辑清晰,但可以稍微优化结构。

    • 建议:目前的结构(先收集,后处理)是合理的,因为它避免了在遍历列表时修改文件系统可能带来的副作用。
  • 注释
    代码中添加了中文注释,这很好。但函数头部的 Doxygen 注释非常完善,建议在关键逻辑(如 fi.size() == 0 判断)处也添加简短注释,解释为什么只删除大小为0的文件。

3. 代码性能

  • 文件系统操作
    do...while 循环中,每次迭代都会遍历所有目录路径并检查 d.isEmpty()。如果有大量目录,这可能会比较慢,因为 isEmpty() 通常需要读取目录内容。
    • 建议:目前的实现在解压失败这种低频操作下是可以接受的。但如果解压文件数量极大,可以考虑对目录路径按深度排序(从深到浅),然后只需遍历一次,先删除深层目录,再删除浅层目录。这样可以避免 do...while 循环。

4. 代码安全

  • 路径遍历风险
    entry.strFullPath 来自压缩包内部的数据。虽然 QDir::absoluteFilePath 会规范化路径,但仍需警惕恶意压缩包包含 ../ 等路径跳转字符,导致文件被写到解压目标目录之外。

    • 建议:在拼接路径前,验证 entry.strFullPath 是否包含路径遍历字符(如 ../)。或者确保 QDir::absoluteFilePath 的结果确实以 strTargetPath 开头。
    • 改进代码示例
      QString absPath = targetDir.absoluteFilePath(relPath);
      if (!absPath.startsWith(targetDir.absolutePath())) {
          continue; // 跳过或记录错误,防止路径遍历攻击
      }
  • 空指针与异常
    DataManager::get_instance() 的调用假设单例始终有效。如果 archiveData() 返回空引用或空指针,后续调用可能会崩溃。

    • 建议:虽然通常单例是安全的,但保持防御性编程总是好的。
  • 信号发射
    readStdout 中新增了 emit signalprogress(100);。这通常用于通知UI进度结束。但在 killProcess() 之前发射,可能会导致状态不一致(进程被杀死了,但进度显示100%)。

    • 建议:确认业务逻辑是否希望在失败时强制将进度设为100。通常失败时可能需要发射错误信号或保持当前进度。

修改后的代码建议(结合部分建议):

void CliInterface::removeExtractedFilesOnFailure(const QString &strTargetPath, const QList<FileEntry> &entries)
{
    QList<FileEntry> entriesToProcess = entries;
    if (entriesToProcess.isEmpty()) {
        // 防御性检查:确保单例和数据有效
        if (DataManager::get_instance().isValid()) {
            entriesToProcess = DataManager::get_instance().archiveData().mapFileEntry.values();
        }
    }
    if (entriesToProcess.isEmpty()) {
        return;
    }

    QDir targetDir(strTargetPath);
    if (!targetDir.exists()) {
        return;
    }

    // 使用 QPair 存储 (绝对路径, 是否是目录)
    QList<QPair<QString, bool>> fileSystemPaths;
    QString targetAbsPath = targetDir.absolutePath();

    for (const FileEntry &entry : entriesToProcess) {
        QString relPath = entry.strFullPath;
        if (relPath.endsWith(QLatin1Char('/'))) {
            relPath.chop(1);
        }
        if (relPath.isEmpty()) {
            continue;
        }

        QString absPath = targetDir.absoluteFilePath(relPath);
        
        // 安全检查:防止路径遍历攻击
        if (!absPath.startsWith(targetAbsPath)) {
            continue;
        }

        fileSystemPaths.append(qMakePair(absPath, entry.isDirectory));
    }

    // 1. 清理残留文件(当前逻辑:仅清理大小为0的文件)
    for (const auto &pair : fileSystemPaths) {
        if (!pair.second) { // 是文件
            QFileInfo fi(pair.first);
            // 仅当文件存在、确实是文件且大小为0时删除
            if (fi.exists() && fi.isFile() && fi.size() == 0) {
                QFile::remove(pair.first);
            }
        }
    }

    // 2. 清理空目录
    // 为了性能优化,可以考虑对目录按深度排序后删除,这里保持原有循环逻辑但使用 rmdir
    bool removed;
    do {
        removed = false;
        for (const auto &pair : fileSystemPaths) {
            if (pair.second) { // 是目录
                QDir dir(pair.first);
                // rmdir 仅删除空目录,更安全
                if (dir.exists() && dir.isEmpty()) {
                    if (dir.rmdir()) {
                        removed = true;
                    }
                }
            }
        }
    } while (removed);
}

总结

这段代码主要解决了在解压失败(特别是分卷加密包密码错误)时清理空文件的问题。整体逻辑清晰,但在**安全性(路径遍历)健壮性(文件清理条件)**方面可以进一步加强。建议采纳上述关于路径验证和目录删除方式的建议。

@deepin-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: LiHua000, max-lvs

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@LiHua000
Copy link
Contributor Author

LiHua000 commented Feb 2, 2026

/merge

@deepin-bot deepin-bot bot merged commit 748509c into linuxdeepin:release/eagle Feb 2, 2026
15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants